In silico screening of some compounds derived from the desert medicinal plant Rhazya stricta for the potential treatment of COVID-19

The latest coronavirus pandemic (SARS-CoV-2) poses an exceptional threat to human health and society worldwide. The coronavirus (SARS-CoV-2) spike (S) protein, which is required for viral–host cell penetration, might be considered a promising and suitable target for treatment. In this study, we utilized the nonalkaloid fraction of the medicinal plant Rhazya stricta to computationally investigate its antiviral activity against SARS-CoV-2. Molecular docking and molecular dynamics simulations were the main tools used to examine the binding interactions of the compounds isolated by HPLC analysis. Ceftazidime was utilized as a reference control, which showed high potency against the SARS-CoV-2 receptor binding domain (RBD) in an in vitro study. The five compounds (CID:1, CID:2, CID:3, CID:4, and CID:5) exhibited remarkable binding affinities (CID:1, − 8.9; CID:2, − 8.7; and CID:3, 4, and 5, − 8.5 kcal/mol) compared to the control compound (− 6.2 kcal/mol). MD simulations over a period of 200 ns further corroborated that certain interactions occurred with the five compounds and the nonalkaloidal compounds retained their positions within the RBD active site. CID:2, CID:4, and CID:5 demonstrated high stability and less variance, while CID:1 and CID:3 were less stable than ceftazidime. The average number of hydrogen bonds formed per timeframe by CID:1, CID:2, CID:3, and CID:5 (0.914, 0.451, 1.566, and 1.755, respectively) were greater than that formed by ceftazidime (0.317). The total binding free energy calculations revealed that the five compounds interacted more strongly within RBD residues (CID:1 = − 68.8, CID:2 = − 71.6, CID:3 = − 74.9, CID:4 = − 75.4, CID:5 = − 60.9 kJ/mol) than ceftazidime (− 34.5 kJ/mol). The drug-like properties of the selected compounds were relatively similar to those of ceftazidime, and the toxicity predictions categorized these compounds into less toxic classes. Structural similarity and functional group analyses suggested that the presence of more H-acceptor atoms, electronegative atoms, acidic oxygen groups, and nitrogen atoms in amide or aromatic groups were common among the compounds with the lowest binding affinities. In conclusion, this in silico work predicts for the first time the potential of using five R. stricta nonalkaloid compounds as a treatment strategy to control SARS-CoV-2 viral entry.


Materials and methods
Collection and preparation of plant samples. Rhazya stricta was collected from its natural habitat in the desert; in the Al Gholah region near Asfan Road (21.9684537, 39.2675785), Jeddah Province. A voucher specimen was deposited in the Department of Biological Sciences Herbarium at King Abdulaziz University (number 1150/M/75; collected by N. Baeshen, M. Baeshen, and J. Sabir). The plant material was taken to the laboratory, and the leaves were cut and washed with running water to remove the dust and left to dry in the laboratory at room temperature. A week later, the dry leaves were ground into a fine powder for the extraction of compounds and biochemical analysis. The authors confirm that the experiments performed on the plants in the present study comply with international and national guidelines.
Alkaloids and nonalkaloids were extracted from R. stricta as described by 10 . In brief, ten grams of plant material was weighed into a clean volumetric flask, and 20 ml of absolute ethanol (99%) was added. The mixture was allowed to sit in a refrigerator (4 °C) for two days. The ethanol was removed by placing the mixture over Whatman filter paper (0.45 µm) and drying the plant material with nitrogen gas. After that, 5 g of plant material was transferred to a clean volumetric flask, and 40 ml of 1 mol/L HCl and 40 ml of HPLC-grade chloroform were added. The chloroform layer was collected and filtered through a PTFE disk filter and then transferred to an LC vial to analyze the nonalkaloids. For the alkaloids, sodium hydroxide was added to the plant mixture to adjust the pH, and then 40 ml of HPLC-grade chloroform was added. The chloroform layer, which contained the alkaloids, was filtered through a PTFE disk filter and transferred to a liquid chromatography vial for analysis.

HPLC-MS/MS analysis and data processing.
This study was performed using an HPLC-MS/MS system that included an ACQUITY UPLC I-Class (Waters Technologies, USA) instrument coupled to a 6500 Qtrap (AB Sciex, Canada). Chromatographic separation was performed using a Zorbax XDB C18 column (2.1 × 150 mm, 3.5 µm) with a temperature maintained at 40 °C, a flow rate of 300 µL/min and an injection volume of 10 µL. Solvents A (0.1% formic acid in HPLC grade water) and B (0.1% formic acid in HPLC grade acetonitrile) were used as the mobile phases. The linear elution gradient was as follows: 2% B (from 0 to 2), 95% B (from 2 to 24), 95% B (held for 2 min), and 4 min of equilibration. The electrospray ionization mass spectrometry (ESI-MS) data were collected in positive mode (ES+) with an electrode voltage of 5500 V, a declustering potential (DP) of 90 V, collision energy of 30 V, and an input potential of 10 V. Nitrogen was used as the nebulizer gas and curtain gas at 30 psi. Mass spectrometry (MS) spectra were acquired in the mass range of 100-900 m/z, and a scan rate www.nature.com/scientificreports/ of 1000 was used to search for enhanced production. For MS-MS data collection, the acquisition rate was set to 1 spectrum per second with a scan range of 50-1000 m/z in automode according to 11 . Upon completion of data collection, the HPLC-MS data files were downloaded in wiff format and then converted to Mzml using MSConvert (ProteoWizard 3.0.20270) 12 . Mzmine (version 2.53) software was used to analyze the results 13 (https:// github. com/ mzmine/ mzmin e2/ relea ses/ tag/ v2. 53). Following data import into Mzmine2, a minimum intensity cutoff of 1,000 was used, and the retention time was set to a tolerance of 0.2 min. Then, the adjusted peaks were compiled into a single mass list to enable detection and comparison. The identification process was performed using the SIRIUS platform coupled with CSI:FingerID for molecular structure identification 14-16 . Ligands and protein preparation. Hundreds of nonalkaloid compounds were identified in R. stricta by HPLC and MS/MS analyses. The structures of the R. stricta nonalkaloid compounds were discovered through HPLC analysis, downloaded from the PubChem database (https:// pubch em. ncbi. nlm. nih. gov/) as SDF files, and then converted into PDB files using PyMOL (Schrödinger, LLC) 17, 18 . The AutoDock 4.2 graphical interface 19 was used to prepare the ligands by adding the computed Gasteiger charge and merging the nonpolar hydrogen atoms. Aromatic carbon rings were also detected to set up a torsion tree. The compound files were saved in pdbqt format for further screening. The crystal structure of the SARS-CoV-2 spike receptor-binding domain (RBD) complexed with ACE2 was retrieved from the RCSB (https:// www. rcsb. org) with PDB ID 6m0j 21 , consisting of 193 amino acids for the RBD and 596 amino acids for ACE2 with a resolution of 2.45 Å. The RBD was prepared by deleting the ACE2 atom coordinates and removing ligands, water, and ionic metals from the complex structure. Nonpolar hydrogen atoms were merged, and Kollman charges were added to the RBD. The prepared protein was saved as a pdbqt file.

Virtual screening. AutoDock
Vina is an open-source program that predicts the binding positions of small molecules into the cavity of interest of a target protein in addition to the small molecule binding affinities 22 . The grid box was concentrated with dimensions (50X, 64Y, 22Z) and a grid point of 0.375 Å on the interface of the RBD, which contains the involved residues in connection with the ACE2 receptor (K417, G446, Y449, Y453, L455, F456, A475, F486, N487, Y489, Q493, G496, Q498, T500, N501, G502, Y505). AutoDock Vina was run using an in-house Python script, which prompted the system to the configuration file containing the screening parameters. We set the exhaustiveness of the search to 8, and 9 conformation modes were generated for each compound.
Molecular docking. The best-docked compounds (~ 60), which had the lowest binding affinities and RMSD values of 0 Å into the RBD interface, were subjected to analysis by ACD/ChemSketch to check for tautomeric forms 23 . Geometry optimization was also conducted using Avogadro software, which has an auto-optimization tool 24 . Structure optimization and energy minimization were performed by the universal force field (UFF) and the steepest descent algorithm. To find accurate conformations and positions, we redocked the ligand-protein complexes using AutoDock Vina with the grid box dimensions listed above the exhaustiveness of the search set to 30. Finally, we concentrated the grid box on the lowest binding affinity position of the best 5 compounds and redocked the complexes. To compare our results with those of a drug that has been confirmed to show in vitro anti-SARS-CoV-2 spike protein activity, we found that the antibiotic ceftazidime, which is used therapeutically to treat bacterial pneumonia, revealed the highest potency among several compounds, with an inhibition rate of approximately 80% and an IC50 of 28 µM 25 . Biovia Discovery Studio Visualizer was used to plot the compounds into the pocket residues and PyMOL 17,26 .

Molecular dynamics simulations. MD simulations were performed using the Groningen Machine for
Chemical Simulations (GROMACS) 2020.3 27 . The procedure was conducted as described 28 . To create the structural topology, the compound-protein complex coordinates were separated into two PDB files. The protein force field parameter was generated using the March 2019 version of Chemistry at Harvard Macromolecular Mechanics (CHARMM 36) 29 , whereas the ligand was converted to Mol file format using Avogadro 24 , and the force field parameters were generated using the official CHARMM General Force Field server (CGenFF: 30. The system was contained within a dodecahedron box with periodic boundary conditions. The box was then solvated using a three-site transferable water model (TIP3P) 31 , which was subsequently neutralized with Na and Cl ions. A 5,000-step structural optimization using the steepest descent algorithm followed by a 100 ps equilibration in the NVT and NPT ensembles utilizing a V-rescale thermostat 32 and a Berendsen barostat 33 for temperature and pressure coupling were carried out to minimize system energy. The van der Waals and electrostatic interaction cutoffs were fixed at 1.2 nm, and long-range electrostatic interactions were calculated using the particle mesh Ewald method 34 . The temperature and pressure were maintained at 300 K and 1 bar for the production run. A V-rescale thermostat 32 with a time constant of 0.1 ps was used to regulate the temperature, while a Parrinello-Rahman barostat 35 with a time constant of 1 ps and compressibility of 4.5 × 10 -5 bar-1 was employed to the control pressure. The simulation was run for 200 ns, and every 10 ps, the energy and coordinates of the trajectories were recorded. The stabilities of the resultant trajectories were then analyzed using the root mean square deviation (RMSD), the root mean square fluctuation (RMSF), the hydrogen bond number, and the final trajectory. The plotting tool Xmgrace was utilized to plot the simulation data 36  www.nature.com/scientificreports/ module. The g_mmpbsa scripts utilized were widely sourced from the GROMACS machine and APBS packages to integrate the molecular dynamics simulation trajectories with the binding energy calculations. A total of 500 snapshots of the complex trajectories were obtained. Electrostatic interactions, van der Waals interactions, polar salvation energy, and nonpolar solvation energy were calculated using the g_mmpbsa module 37,38 .

Drug-likeness analysis.
To investigate the oral bioavailability of the top 5 compounds, Lipinski's rule of five 39 was applied, which considers the following parameters: molecular weight, lipophilicity, number of hydrogen bond donors, and number of nitrogen and oxygen atoms. To predict the drug-likeness profile of the compounds with the lowest binding affinities from the docking results, the PDB files of the compounds were submitted to SwissADME, which is a webserver that calculates the physiochemical properties of compounds and applies drug-likeness parameters 40 . Moreover, the results were compared to the injectable antibiotic ceftazidime.

Toxicity risk prediction.
ProTox-II is a website that predicts the acute oral toxicities of small molecules and places them into classes according to the globally harmonized system of classification and labelling of chemicals (GHS) 41 . Canonical Smiles of the compounds were obtained from the PubChem database and inputted into the webserver. The compound toxicity classes and LD50 predictions were generated as output results.
Structural skeleton similarity analysis. To find the common structural skeleton similarity of the compounds of interest, we utilized DataWarrior software, which is a multipurpose chemical data visualization and analysis tool that is interactive and chemistry-aware 42 . Forty compounds with a 7.1 (kcal/mol) binding affinity cutoff were submitted to DataWarrior for structure analysis. We counted and analyzed the presence of 8 structural parameters: aromatic carbon atoms, carbo rings, electronegative atoms, H-acceptor atoms, H-donor atoms, heterorings, ring closures, and rotatable bonds.

Results and discussion
HPLC-MS/MS analysis. The nonalkaloids from R. stricta were isolated using HPLC. Separation using HPLC confirmed the existence of nine nonalkaloid compounds and seven additional alkaloids as minor components (Fig. 1).
Virtual screening and molecular docking. Virtual screening is a fast-scanning method that can dock a library of ligands into the active site of the protein of interest and reduce the library by eliminating those with the highest binding energies. In our study, after we screened the compounds obtained from the phytochemical analysis, the compounds with the lowest binding affinity were optimized, redocked, and prepared for further analysis.
The involved residues and number of bonds formed by the 5 best compounds (Table 1) and the compound control ceftazidime are presented in Table 2. Compound CID:1 demonstrated a binding affinity of − 8.9 (kcal/mol) and formed six conventional hydrogen bonds with RBD interface residues ARG346, ASN448, LYS444, GLN493, and SER494. In addition, four hydrophobic interactions were observed of the types pi-pi stacked, pi-alkyl, and alkyl in contact with PHE490, TYR449, and LYS444 residues. Compound CID:2 showed a binding affinity of − 8.7 (kcal/mol) and formed seven hydrogen bonds with residues TYR449, GLN493, SER494, ARG403, and TYR453. Moreover, six hydrophobic interactions of the types pi-pi stacked, pi-alkyl and alkyl involved with the SARS-CoV-2 RBD residues PHE490, LEU452, and LYS417. Compounds CID:3, CID:4, and CID:5  www.nature.com/scientificreports/ showed binding affinities of − 8.5 (kcal/mol) with various numbers of hydrogen bonds and other interactions. Compound CID:3 formed six hydrogen bonds with the active site residues TYR449, GLN493, and GLY496. In addition, four hydrophobic interactions of the types T-shaped pi-pi, amide pi-stacked, alkyl interactions with TYR449, TYR505, LEU452, and GLN493, and two halogen bonds with GLN493 and LEU492. Compound CID:4 formed five hydrogen bonds with the RBD binding site residues GLN493, SER494, and GLY496 and six hydrophobic interactions of the types-pi-pi stacked, pi-alkyl, and alkyl with key residues ARG403, TYR449, PHE497, TYR505, and TYR495. Compound CID:5 formed seven hydrogen bonds with residues ARG403, TYR453, GLY496, and GLN498 and seven hydrophobic interactions of the types T-shaped pi-pi, pi-alkyl, and alkyl with residues TYR449, TYR505, LYS417, and LEU455. In addition, one pi -cation electrostatic interaction was formed with ARG403. Ceftazidime was considered a control, giving a binding affinition of − 6.2 (kcal/mol), and formed six conventional hydrogen bonds with residues ARG403, TYR453, GLY496, TYR449, and GLN498. In addition, attractive electrostatic forces, one pi -alkyl interaction with residues ARG403 and TYR449, and hydrophobic interactions were involved in the ceftazidime-RBD interaction. In addition to the mentioned contacts, the six compounds interacted with the interface residues by van Der Waals forces (Figs. 2 and 3). In addition to the reference compounds, the five compounds formed many bonds with key residues LYS417, TYR453, TYR505, ASN501 GLN493, GLY496, TYR449, GLN498, and LEU455 of the SARS-CoV-2 RBD, which may interrupt the viral host cell recognition process. The 5 phytochemical compounds investigated through molecular docking simulations here revealed significantly better binding energy. In contrast to the reference compounds, these five compounds established networks of hydrophobic interactions that contribute to the binding affinity of the predicted complexes. RMSD and RMSF. The root mean square deviation (RMSD) is a critical measure for analyzing the equilibration of MD trajectories and determining the stability of protein-ligand complex systems during the simulation process. This value was calculated for each compound and compared to that of ceftazidime with respect to the initial pose as a reference frame. All compounds retained their docking position with some deviation (Fig. 4). Ceftazidime showed insignificant deviation (~ 0.18-0.27 nm) and high stability with rare major vari-  Hydrogen bonds. The number of hydrogen bonds that a compound forms with protein residues plays a critical role in enhancing complex stability. The cutoff angle and distance for H-bond analysis were set to 30° and 3.5 Å (Fig. 6). Trajectory. The compound positions and the interacting residues during the MD simulations were visualized by extracting the final frame. In addition to ceftazidime, the docked MD simulations poses of compounds CID:1 CID:2, CID:3, CID:4, and CID:5 were nearly identical to the redocked poses with the exception of CID:3, which abandoned its polar interactions with ASN501 to TYR505. Notably, the resulting docked position of ceftazidime was compatible with a previous investigation, which revealed that the residues SER494 and TYR505 play critical roles in the binding of SARS-CoV-2 to the hACE2 receptor 23 (Fig. 7).
Binding energy calculations. The MM-PBSA analyses were performed by extracting 500 snapshots of the stabilized frames for the 5 complexes as well as the control complex to calculate the binding energy average. A highly negative binding energy indicates stable binding of a small molecule to a protein. Compared to the RBD-ceftazidime complex (− 34.495 kJ/mol), the resulting binding free energies of the five ligand complexes gave values that Table 2. List of the interactions and binding affinities between the selected 5 compounds and SARS-CoV-2 RBD residues found during visualization of the complex structure by discovery studio visualizer. www.nature.com/scientificreports/  Table 3). The MMPBSA-based binding energy values of the identified ligands toward the SARS-CoV-2 RBD reflect that each ligand binds efficiently. This conclusion is further supported by data from other parameters, such as the RMSD, RMSF, number of HBs, and average number of HBs per frame calculated from the MD trajectories.
Drug-likeness and toxicity predictions. Drug-likeness predictions were conducted using SwissADME to calculate the physiochemical properties and then applying Lipinski's rule of five for comparison with ceftazidime, which is an injectable broad-spectrum antibiotic used to treat bacterial infections including lower respiratory tract pneumonia 43 . Violation of the implemented parameters of Lipinski's rule of five suggests increased absorption and permeation difficulties if the compounds are administered orally 39 . All compounds as well as the control compounds violated the molecular weight parameter (662.18, 578.83, 643.88, 539.59, 676.55, and  (Table 4). In addition to the drug likeness predictions, drug toxicity was predicted using the ProTox-II webserver. The quantities of each substance that resulted in death to 50% of the population (LD50) were determined and categorized according to the globally harmonized system of clas- www.nature.com/scientificreports/ sification and labelling of chemicals. The LD50 values of compounds CID:1, CID:2, and CID:4 (464, 705, and 1300 mg/kg, respectively) were predicted to belong to class 4 (300 < LD50 ≤ 2000). Compound CID:3 showed less toxicity with an LD50 of 3016 mg/kg and was categorized into class 5 (2000 < LD50 ≤ 5000). CID:5 was labeled as nontoxic compound with an LD50 of 10,000 mg/kg, and classified as class 6 (LD50 > 5000) ( Table 5). These results suggest that the bioactive compounds CID:1, CID:2, CID:3, CID:4, and CID:5 have lower gastrointestinal absorption properties compared with ceftazidime. In contrast to CID:5 and ceftazidime, which were predicted to be nontoxic substances, CID:1 and CID:2 may possess toxicity risks, and CID:4 and CID:3 are associated with less risk. www.nature.com/scientificreports/ Structural skeleton similarity analysis. The 40 compounds that gave best binding affinities were submitted to DataWarrior software analysis, which was utilized to count and categorize 8 structural skeleton parameters and investigate their similarities (Fig. 8). To define the flexibility of the compounds, we counted the number of intramolecular rotatable bonds and found that 29 of the compounds were composed of 5-9 rotatable bonds, including 7 out of the best 10 compounds. Electronegative atoms (N, O, S, F, Cl) within small molecules play critical roles in forming hydrogen bonds with protein residues; at least 7 of these atoms were counted in CID:16, and 28 compounds had 11-15 electronegative atoms. To determine the form of the electronegative atoms, we counted the number of H-bond donors and H-bond acceptors. Twenty-eight compounds contained 2-4 H-donor atoms, including 7 of the best docked compounds. In contrast, at least 4 H-acceptor atoms were counted in CID:27, and 28 compounds, including 7 of the best docked compounds, contained 9-13 H-acceptor atoms. Ring closures, which are mostly composed of carbon atoms involved in hydrophobic and electrostatic interactions, were counted, and we found 33 compounds containing 4 or 5 rings, including the best 9 compounds. To determine the form of the ring closures, we counted both carbo-and heteroring closures. Among  www.nature.com/scientificreports/ the carbo rings, we found 29 compounds containing 1-2 rings. In contrast to the carborings, 33 compounds contained 2-4 heterorings, including 7 of the best 10 compounds. Finally, we counted the number of aromatic carbon atoms, and the 40 compounds were distributed among 15 categories. Twenty-seven compounds were in the range of 17-27 aromatic atoms, including 8 of the ten best compounds. After we characterized and analyzed the 40 compounds that gave an affinity binding score of − 7.1 (kcal/mol) or less, we found no direct or inverse relationship between the parameters that we investigated and the binding affinities of the compounds. In addition, the desired range and cutoff of some of the parameter combinations could lead to an increase in the binding affinity of small molecules to the SARS-CoV-2 RBD, such as at least 7 electronegative atoms, 5-9 rotatable bonds, 4-2 H-donor atoms, at least 7 H-acceptor atoms, 1-3 carbo rings, and 2-4 hetero rings.

Conclusion
Our study aimed to find a prospective drug against SARS-CoV-2 and examine the similarities among the investigated compounds by utilizing compounds derived from the medicinal plant R. stricta. Applying a virtual screened approach, we identified five compounds from the R. stricta nonalkaloid extract. In comparison to the reference control (ceftazidime), the lead compounds exhibited remarkable binding affinities and strong interactions with key residues of the SARS-CoV-2 spike protein. The findings of this in silico study suggest that these compounds can be considered potential antiviral drugs to treat SARS-CoV-2 by interfering with the viral ACE2 recognition process. However, more experimental validation is required to confirm the antiviral activity of the selected compounds against the SARS-CoV-2 RBD. The two main outcomes from the structural skeleton analysis could help researchers design or narrow their search to find bioactive compounds targeting the SARS-CoV-2 RBD: electronegative atoms are preferable in the H-acceptor form, and ring closures are preferable in heteroatom form. However, a large compound population and more structural skeleton parameter analyses are required to reveal more recommended characteristics and confirm our findings.  www.nature.com/scientificreports/